AITopics | supervised classifier

Collaborating Authors

supervised classifier

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Label Noise Cleaning for Supervised Classification via Bernoulli Random Sampling

Liu, Yuxin, Jin, Xiong, Han, Yang

arXiv.org Machine LearningMar-17-2026

Label noise - incorrect labels assigned to observations - can substantially degrade the performance of supervised classifiers. This paper proposes a label noise cleaning method based on Bernoulli random sampling. We show that the mean label noise levels of subsets generated by Bernoulli random sampling containing a given observation are identically distributed for all clean observations, and identically distributed, with a different distribution, for all noisy observations. Although the mean label noise levels are not independent across observations, by introducing an independent coupling we further prove that they converge to a mixture of two well-separated distributions corresponding to clean and noisy observations. By establishing a linear model between cross-validated classification errors and label noise levels, we are able to approximate this mixture distribution and thereby separate clean and noisy observations without any prior label information. The proposed method is classifier-agnostic, theoretically justified, and demonstrates strong performance on both simulated and real datasets.

artificial intelligence, inductive learning, machine learning, (16 more...)

arXiv.org Machine Learning

2603.14387

Country:

Europe > United Kingdom > England > Greater Manchester > Manchester (0.40)
North America > United States > Wisconsin (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

AT-CXR: Uncertainty-Aware Agentic Triage for Chest X-rays

Li, Xueyang, Jiang, Mingze, Xu, Gelei, Xia, Jun, Jia, Mengzhao, Chen, Danny, Shi, Yiyu

arXiv.org Artificial IntelligenceAug-28-2025

Agentic AI is advancing rapidly, yet truly autonomous medical-imaging triage, where a system decides when to stop, escalate, or defer under real constraints, remains relatively underexplored. To address this gap, we introduce AT-CXR, an uncertainty-aware agent for chest X-rays. The system estimates per-case confidence and distributional fit, then follows a stepwise policy to issue an automated decision or abstain with a suggested label for human intervention. We evaluate two router designs that share the same inputs and actions: a deterministic rule-based router and an LLM-decided router. Across five-fold evaluation on a balanced subset of NIH ChestX-ray14 dataset, both variants outperform strong zero-shot vision-language models and state-of-the-art supervised classifiers, achieving higher full-coverage accuracy and superior selective-prediction performance, evidenced by a lower area under the risk-coverage curve (AURC) and a lower error rate at high coverage, while operating with lower latency that meets practical clinical constraints. The two routers provide complementary operating points, enabling deployments to prioritize maximal throughput or maximal accuracy. Our code is available at https://github.com/XLIAaron/uncertainty-aware-cxr-agent.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2508.19322

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood

Xu, Yang, Wang, Yu, An, Hao, Liu, Zhichen, Li, Yongyuan

arXiv.org Artificial IntelligenceJun-28-2024

Human and model-generated texts can be distinguished by examining the magnitude of likelihood in language. However, it is becoming increasingly difficult as language model's capabilities of generating human-like texts keep evolving. This study provides a new perspective by using the relative likelihood values instead of absolute ones, and extracting useful features from the spectrum-view of likelihood for the human-model text detection task. We propose a detection procedure with two classification methods, supervised and heuristic-based, respectively, which results in competitive performances with previous zero-shot detection methods and a new state-of-the-art on short-text detection. Our method can also reveal subtle differences between human and model languages, which find theoretical roots in psycholinguistics studies. Our code is available at https://github.com/CLCS-SUSTech/FourierGPT

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2406.19874

Country: Europe (0.46)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.66)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Add feedback

Knowledge Distillation in Automated Annotation: Supervised Text Classification with LLM-Generated Training Labels

Pangakis, Nicholas, Wolken, Samuel

arXiv.org Artificial IntelligenceJun-25-2024

Computational social science (CSS) practitioners often rely on human-labeled data to fine-tune supervised text classifiers. We assess the potential for researchers to augment or replace human-generated training data with surrogate training labels from generative large language models (LLMs). We introduce a recommended workflow and test this LLM application by replicating 14 classification tasks and measuring performance. We employ a novel corpus of English-language text classification data sets from recent CSS articles in high-impact journals. Because these data sets are stored in password-protected archives, our analyses are less prone to issues of contamination. For each task, we compare supervised classifiers fine-tuned using GPT-4 labels against classifiers fine-tuned with human annotations and against labels from GPT-4 and Mistral-7B with few-shot in-context learning. Our findings indicate that supervised classification models fine-tuned on LLM-generated labels perform comparably to models fine-tuned with labels from human annotators. Fine-tuning models using LLM-generated labels can be a fast, efficient and cost-effective method of building supervised text classifiers.

annotation, classifier, text sample, (15 more...)

arXiv.org Artificial Intelligence

2406.17633

Country:

North America > United States > Pennsylvania (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Government > Immigration & Customs (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Residual ANODE

Das, Ranit, Kasieczka, Gregor, Shih, David

arXiv.org Artificial IntelligenceDec-18-2023

We present R-ANODE, a new method for data-driven, model-agnostic resonant anomaly detection that raises the bar for both performance and interpretability. The key to R-ANODE is to enhance the inductive bias of the anomaly detection task by fitting a normalizing flow directly to the small and unknown signal component, while holding fixed a background model (also a normalizing flow) learned from sidebands. In doing so, R-ANODE is able to outperform all classifier-based, weakly-supervised approaches, as well as the previous ANODE method which fit a density estimator to all of the data in the signal region instead of just the signal. We show that the method works equally well whether the unknown signal fraction is learned or fixed, and is even robust to signal fraction misspecification. Finally, with the learned signal model we can sample and gain qualitative insights into the underlying anomaly, which greatly enhances the interpretability of resonant anomaly detection and offers the possibility of simultaneously discovering and characterizing the new physics that could be hiding in the data.

arxiv, r-anode, sig, (14 more...)

arXiv.org Artificial Intelligence

2312.11629

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Europe > Germany > Hamburg (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Ensembling Uncertainty Measures to Improve Safety of Black-Box Classifiers

Zoppi, Tommaso, Ceccarelli, Andrea, Bondavalli, Andrea

arXiv.org Artificial IntelligenceAug-23-2023

Machine Learning (ML) algorithms that perform classification may predict the wrong class, experiencing misclassifications. It is well-known that misclassifications may have cascading effects on the encompassing system, possibly resulting in critical failures. This paper proposes SPROUT, a Safety wraPper thROugh ensembles of UncertainTy measures, which suspects misclassifications by computing uncertainty measures on the inputs and outputs of a black-box classifier. If a misclassification is detected, SPROUT blocks the propagation of the output of the classifier to the encompassing system. The resulting impact on safety is that SPROUT transforms erratic outputs (misclassifications) into data omission failures, which can be easily managed at the system level. SPROUT has a broad range of applications as it fits binary and multi-class classification, comprising image and tabular datasets. We experimentally show that SPROUT always identifies a huge fraction of the misclassifications of supervised classifiers, and it is able to detect all misclassifications in specific cases. SPROUT implementation contains pre-trained wrappers, it is publicly available and ready to be deployed with minimal effort.

artificial intelligence, classifier, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2308.12065

Country:

North America > United States (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Transportation > Air (0.61)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Stance Detection With Supervised, Zero-Shot, and Few-Shot Applications

Burnham, Michael

arXiv.org Artificial IntelligenceMay-2-2023

Stance detection is the identification of an author's beliefs about a subject from a document. Researchers widely rely on sentiment analysis to accomplish this. However, recent research has show that sentiment analysis is only loosely correlated with stance, if at all. This paper advances methods in text analysis by precisely defining the task of stance detection, providing a generalized framework for the task, and then presenting three distinct approaches for performing stance detection: supervised classification, zero-shot classification with NLI classifiers, and in-context learning. In doing so, I demonstrate how zero-shot and few-shot language classifiers can replace human labelers for a variety of tasks and discuss how their application and limitations differ from supervised classifiers. Finally, I demonstrate an application of zero-shot stance detection by replicating Block Jr et al. (2022).

large language model, natural language, tweet, (16 more...)

arXiv.org Artificial Intelligence

2305.01723

Country:

Asia > Russia (0.46)
Europe > Eastern Europe (0.04)
Europe > Ukraine (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Government > Regional Government (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.99)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Self-Training of Halfspaces with Generalization Guarantees under Massart Mislabeling Noise Model

Hadjadj, Lies, Amini, Massih-Reza, Louhichi, Sana, Deschamps, Alexis

arXiv.org Machine LearningDec-2-2021

We investigate the generalization properties of a self-training algorithm with halfspaces. The approach learns a list of halfspaces iteratively from labeled and unlabeled training data, in which each iteration consists of two steps: exploration and pruning. In the exploration phase, the halfspace is found sequentially by maximizing the unsigned-margin among unlabeled examples and then assigning pseudo-labels to those that have a distance higher than the current threshold. The pseudo-labeled examples are then added to the training set, and a new classifier is learned. This process is repeated until no more unlabeled examples remain for pseudo-labeling. In the pruning phase, pseudo-labeled samples that have a distance to the last halfspace greater than the associated unsigned-margin are then discarded. We prove that the misclassification error of the resulting sequence of classifiers is bounded and show that the resulting semi-supervised approach never degrades performance compared to the classifier learned using only the initial labeled training set. Experiments carried out on a variety of benchmarks demonstrate the efficiency of the proposed approach compared to state-of-the-art methods.

algorithm, halfspace, misclassification error, (15 more...)

arXiv.org Machine Learning

2111.14427

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Oceania > Australia (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Vision-based Driver Assistance Systems: Survey, Taxonomy and Advances

Horgan, Jonathan, Hughes, Ciarán, McDonald, John, Yogamani, Senthil

arXiv.org Artificial IntelligenceApr-26-2021

Vision-based driver assistance systems is one of the rapidly growing research areas of ITS, due to various factors such as the increased level of safety requirements in automotive, computational power in embedded systems, and desire to get closer to autonomous driving. It is a cross disciplinary area encompassing specialised fields like computer vision, machine learning, robotic navigation, embedded systems, automotive electronics and safety critical software. In this paper, we survey the list of vision based advanced driver assistance systems with a consistent terminology and propose a taxonomy. We also propose an abstract model in an attempt to formalize a top-down view of application development to scale towards autonomous driving system.

application, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ITSC.2015.329

2104.12583

Country:

North America > United States > Texas > Lubbock County > Lubbock (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Ireland (0.04)

Genre:

Research Report (0.50)
Overview (0.46)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks > Manufacturer (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Uncertain Time Series Classification With Shapelet Transform

Mbouopda, Michael Franklin, Nguifo, Engelbert Mephu

arXiv.org Machine LearningFeb-3-2021

Time series classification is a task that aims at classifying chronological data. It is used in a diverse range of domains such as meteorology, medicine and physics. In the last decade, many algorithms have been built to perform this task with very appreciable accuracy. However, applications where time series have uncertainty has been under-explored. Using uncertainty propagation techniques, we propose a new uncertain dissimilarity measure based on Euclidean distance. We then propose the uncertain shapelet transform algorithm for the classification of uncertain time series. The large experiments we conducted on state of the art datasets show the effectiveness of our contribution. The source code of our contribution and the datasets we used are all available on a public repository.

classification, dataset, time sery, (14 more...)

arXiv.org Machine Learning

2102.0209

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > France > Auvergne-Rhône-Alpes > Puy-de-Dôme > Clermont-Ferrand (0.04)
Asia (0.04)

Genre:

Research Report (0.64)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.95)

Add feedback